https://rss.onlinelibrary.wiley.com/doi/10.1111/j.1740-9713.2018.01169.x
http://www.youtube.com/watch?v=XcBLEVknqvY
https://www.rstudio.com/products/rstudio/download/
https://moderndive.com/2-getting-started.html
https://cran.r-project.org/web/packages/addinslist/README.html
https://rstudio.github.io/rstudioaddins/
# devtools::install_github("rstudio/addinexamples", type = "source")
In R, the same task can be done in many different ways
R Syntax Comparison::CHEAT SHEET
https://www.amelia.mn/Syntax-cheatsheet.pdf
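As a small illustration of that point, the sketch below computes the mean sepal length per species in three base R notations (dollar sign, formula, split-apply). It uses the built-in `iris` data; the object names `by_dollar`, `by_formula`, and `by_split` are chosen here for illustration and do not come from the cheat sheet.

```r
# Three equivalent ways to get the mean Sepal.Length per species (base R only).

# 1. Dollar-sign syntax with tapply()
by_dollar <- tapply(iris$Sepal.Length, iris$Species, mean)

# 2. Formula syntax with aggregate()
by_formula <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)

# 3. Split the vector by group, then apply mean() to each piece
by_split <- sapply(split(iris$Sepal.Length, iris$Species), mean)

print(by_dollar)
print(by_formula)
print(by_split)
```

All three return the same numbers; they differ only in the shape of the result (named array, data frame, named vector).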
I love the #rstats community.
— Frank Elavsky ᴰᵃᵗᵃ ᵂᶦᶻᵃʳᵈ (@Frankly_Data) July 3, 2018
Someone is like, “oh hey peeps, I saw a big need for this mundane but difficult task that I infrequently do, so I created a package that will literally scrape the last bits of peanut butter out of the jar for you. It's called pbplyr.”
What a tribe.
https://blog.mitchelloharawild.com/blog/user-2018-feature-wall/
Available CRAN Packages By Name
https://cran.r-project.org/web/packages/available_packages_by_name.html
Bioconductor
https://www.bioconductor.org
RecommendR
http://recommendr.info/
pkgsearch
CRAN package search
https://github.com/metacran/pkgsearch
Awesome R
https://awesome-r.com/
# ?mean
# ??efetch
# help(merge)
# example(merge)
RDocumentation https://www.rdocumentation.org
R Package Documentation https://rdrr.io/
GitHub
Stack Overflow
How I use #rstats
— Emily Bovee (@ebovee09) August 10, 2018
h/t @ThePracticalDev pic.twitter.com/erRnTG0Ujr
http://cran.r-project.org/doc/contrib/Baggott-refcard-v2.pdf
https://www.rstudio.com/resources/cheatsheets/
https://github.com/qinwf/awesome-R#readme
https://twitter.com/hashtag/rstats?src=hash
Got a question to ask on @SlackHQ or post on @github? No time to read the long post on how to use reprex? Here is a 20-second gif for you to format your R codes nicely and for others to reproduce your problem. (An example from a talk given by @JennyBryan) #rstat pic.twitter.com/gpuGXpFIsX
— ZhiYang (@zhiiiyang) October 18, 2018
install.packages("tidyverse", dependencies = TRUE)
install.packages("jmv", dependencies = TRUE)
install.packages("questionr", dependencies = TRUE)
install.packages("Rcmdr", dependencies = TRUE)
install.packages("summarytools")
# library() stops with an error if a package is missing; require() only warns
library(tidyverse)
library(jmv)
library(questionr)
library(summarytools)
library(gganimate)
https://support.rstudio.com/hc/en-us/articles/218611977-Importing-Data-with-RStudio
Spreadsheet users using #rstats: where's the data?
#rstats users using spreadsheets: where's the code?
— Leonard Kiefer (@lenkiefer) July 7, 2018
# library(nycflights13)
# summary(flights)
View(data)
data
head
tail
glimpse
str
skimr::skim()
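A minimal sketch of these inspection functions, run on the built-in `iris` data so that it needs no extra packages. `glimpse()` and `skimr::skim()` are left out because they require dplyr and skimr; the names `first_rows` and `last_rows` are illustrative.

```r
# Look at the first and last rows of a data frame
first_rows <- head(iris, 3)
last_rows  <- tail(iris, 3)
print(first_rows)
print(last_rows)

# Compact overview of the structure: column names, types, first values
str(iris)

# Number of rows and columns
print(dim(iris))
```

`View(iris)` opens the same data in a spreadsheet-like viewer, which works best in an interactive session such as RStudio.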
The questionr package will be used
https://juba.github.io/questionr/articles/recoding_addins.html
summary()
mean
median
min
max
sd
table()
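The functions listed above can be tried on a single column. The sketch below uses `Sepal.Length` and `Species` from the built-in `iris` data; the names `x`, `stats`, and `species_counts` are illustrative.

```r
x <- iris$Sepal.Length

# Collect the individual statistics in one named vector
stats <- c(
  mean   = mean(x),
  median = median(x),
  min    = min(x),
  max    = max(x),
  sd     = sd(x)
)
print(round(stats, 2))

# summary() reports the five-number summary plus the mean in one call
print(summary(x))

# table() counts how often each category occurs
species_counts <- table(iris$Species)
print(species_counts)
```

If a column contains missing values, `mean()`, `median()`, `min()`, `max()`, and `sd()` return `NA` unless `na.rm = TRUE` is passed.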
library(readr)
irisdata <- read_csv("data/iris.csv")
## Parsed with column specification:
## cols(
## Sepal.Length = col_double(),
## Sepal.Width = col_double(),
## Petal.Length = col_double(),
## Petal.Width = col_double(),
## Species = col_character()
## )
jmv::descriptives(
data = irisdata,
vars = "Sepal.Length",
splitBy = "Species",
freq = TRUE,
hist = TRUE,
dens = TRUE,
bar = TRUE,
box = TRUE,
violin = TRUE,
dot = TRUE,
mode = TRUE,
sum = TRUE,
sd = TRUE,
variance = TRUE,
range = TRUE,
se = TRUE,
skew = TRUE,
kurt = TRUE,
quart = TRUE,
pcEqGr = TRUE)
##
## DESCRIPTIVES
##
## Descriptives
## ─────────────────────────────────────────────────────
## Species Sepal.Length
## ─────────────────────────────────────────────────────
## N setosa 50
## versicolor 50
## virginica 50
## Missing setosa 0
## versicolor 0
## virginica 0
## Mean setosa 5.01
## versicolor 5.94
## virginica 6.59
## Std. error mean setosa 0.0498
## versicolor 0.0730
## virginica 0.0899
## Median setosa 5.00
## versicolor 5.90
## virginica 6.50
## Mode setosa 5.00
## versicolor 5.50
## virginica 6.30
## Sum setosa 250
## versicolor 297
## virginica 329
## Standard deviation setosa 0.352
## versicolor 0.516
## virginica 0.636
## Variance setosa 0.124
## versicolor 0.266
## virginica 0.404
## Range setosa 1.50
## versicolor 2.10
## virginica 3.00
## Minimum setosa 4.30
## versicolor 4.90
## virginica 4.90
## Maximum setosa 5.80
## versicolor 7.00
## virginica 7.90
## Skewness setosa 0.120
## versicolor 0.105
## virginica 0.118
## Std. error skewness setosa 0.337
## versicolor 0.337
## virginica 0.337
## Kurtosis setosa -0.253
## versicolor -0.533
## virginica 0.0329
## Std. error kurtosis setosa 0.662
## versicolor 0.662
## virginica 0.662
## 25th percentile setosa 4.80
## versicolor 5.60
## virginica 6.23
## 50th percentile setosa 5.00
## versicolor 5.90
## virginica 6.50
## 75th percentile setosa 5.20
## versicolor 6.30
## virginica 6.90
## ─────────────────────────────────────────────────────
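A few of the figures in the table above can be cross-checked with base R. This is only a sketch: it assumes that `data/iris.csv` holds the same values as the built-in `iris` data set, and the names `x_by_species` and `check` are illustrative.

```r
# Split Sepal.Length into one vector per species
x_by_species <- split(iris$Sepal.Length, iris$Species)

# N, mean and standard deviation per group
check <- data.frame(
  n    = sapply(x_by_species, length),
  mean = sapply(x_by_species, mean),
  sd   = sapply(x_by_species, sd)
)

# Standard error of the mean = sd / sqrt(n)
check$se <- check$sd / sqrt(check$n)

print(round(check, 4))
```

Under that assumption the rounded values agree with the N, Mean, Standard deviation, and Std. error mean rows reported by `jmv::descriptives()`.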
# install.packages("scatr")
scatr::scat(
data = irisdata,
x = "Sepal.Length",
y = "Sepal.Width",
group = "Species",
marg = "dens",
line = "linear",
se = TRUE)
https://cran.r-project.org/web/packages/summarytools/vignettes/Introduction.html
# library(summarytools)
summarytools::freq(iris$Species, style = "rmarkdown")
Variable: iris$Species
Type: Factor (unordered)
| | Freq | % Valid | % Valid Cum. | % Total | % Total Cum. |
|---|---|---|---|---|---|
| setosa | 50 | 33.33 | 33.33 | 33.33 | 33.33 |
| versicolor | 50 | 33.33 | 66.67 | 33.33 | 66.67 |
| virginica | 50 | 33.33 | 100.00 | 33.33 | 100.00 |
| <NA> | 0 | | | 0.00 | 100.00 |
| Total | 150 | 100.00 | 100.00 | 100.00 | 100.00 |
freq(iris$Species, report.nas = FALSE, style = "rmarkdown", omit.headings = TRUE)
| | Freq | % | % Cum. |
|---|---|---|---|
| setosa | 50 | 33.33 | 33.33 |
| versicolor | 50 | 33.33 | 66.67 |
| virginica | 50 | 33.33 | 100.00 |
| Total | 150 | 100.00 | 100.00 |
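For comparison, the same counts and percentages can be built with base R alone. This is a sketch of what a frequency table contains, not of how summarytools computes it; the names `counts`, `props`, and `freq_tab` are illustrative.

```r
# Absolute counts per category
counts <- table(iris$Species)

# Proportions of the total
props <- prop.table(counts)

# Combine counts, percentages and cumulative percentages in one data frame
freq_tab <- data.frame(
  Freq   = as.vector(counts),
  Pct    = round(100 * as.vector(props), 2),
  CumPct = round(100 * cumsum(as.vector(props)), 2),
  row.names = names(counts)
)
print(freq_tab)
```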
with(tobacco, print(ctable(smoker, diseased), method = 'render'))
| smoker | diseased: Yes | diseased: No | Total |
|---|---|---|---|
| Yes | 125 (41.95%) | 173 (58.05%) | 298 (100.00%) |
| No | 99 (14.10%) | 603 (85.90%) | 702 (100.00%) |
| Total | 224 (22.40%) | 776 (77.60%) | 1000 (100.00%) |
Generated by summarytools 0.8.8 (R version 3.5.1)
2018-12-19
with(tobacco,
print(ctable(smoker, diseased, prop = 'n', totals = FALSE),
omit.headings = TRUE, method = "render"))
| smoker | diseased: Yes | diseased: No |
|---|---|---|
| Yes | 125 | 173 |
| No | 99 | 603 |
summarytools::descr(iris, style = "rmarkdown")
Non-numerical variable(s) ignored: Species
Data Frame: iris
N: 150
| | Sepal.Length | Sepal.Width | Petal.Length | Petal.Width |
|---|---|---|---|---|
| Mean | 5.84 | 3.06 | 3.76 | 1.20 |
| Std.Dev | 0.83 | 0.44 | 1.77 | 0.76 |
| Min | 4.30 | 2.00 | 1.00 | 0.10 |
| Q1 | 5.10 | 2.80 | 1.60 | 0.30 |
| Median | 5.80 | 3.00 | 4.35 | 1.30 |
| Q3 | 6.40 | 3.30 | 5.10 | 1.80 |
| Max | 7.90 | 4.40 | 6.90 | 2.50 |
| MAD | 1.04 | 0.44 | 1.85 | 1.04 |
| IQR | 1.30 | 0.50 | 3.50 | 1.50 |
| CV | 0.14 | 0.14 | 0.47 | 0.64 |
| Skewness | 0.31 | 0.31 | -0.27 | -0.10 |
| SE.Skewness | 0.20 | 0.20 | 0.20 | 0.20 |
| Kurtosis | -0.61 | 0.14 | -1.42 | -1.36 |
| N.Valid | 150.00 | 150.00 | 150.00 | 150.00 |
| Pct.Valid | 100.00 | 100.00 | 100.00 | 100.00 |
descr(iris, stats = c("mean", "sd", "min", "med", "max"), transpose = TRUE,
omit.headings = TRUE, style = "rmarkdown")
Non-numerical variable(s) ignored: Species
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Sepal.Length | 5.84 | 0.83 | 4.30 | 5.80 | 7.90 |
| Sepal.Width | 3.06 | 0.44 | 2.00 | 3.00 | 4.40 |
| Petal.Length | 3.76 | 1.77 | 1.00 | 4.35 | 6.90 |
| Petal.Width | 1.20 | 0.76 | 0.10 | 1.30 | 2.50 |
view(dfSummary(iris))
Method 'viewer' only valid within RStudio. Switching method to 'browser'.
Output file written: /var/folders/76/rq_s_23s7fd5r8hqrbg8rmnc0000gp/T//Rtmp1Mxgoj/file3e4b6cfc08d1.html
dfSummary(tobacco, plain.ascii = FALSE, style = "grid")
tobacco
N: 1000
| No | Variable | Stats / Values | Freqs (% of Valid) | Text Graph | Valid | Missing |
|---|---|---|---|---|---|---|
| 1 | gender [factor] | 1. F<br>2. M | 489 (50.0%)<br>489 (50.0%) | IIIIIIIIIIIIIIII<br>IIIIIIIIIIIIIIII | 978 (97.8%) | 22 (2.2%) |
| 2 | age [numeric] | mean (sd) : 49.6 (18.29)<br>min < med < max : 18 < 50 < 80<br>IQR (CV) : 32 (0.37) | 63 distinct values | | 975 (97.5%) | 25 (2.5%) |
| 3 | age.gr [factor] | 1. 18-34<br>2. 35-50<br>3. 51-70<br>4. 71 + | 258 (26.5%)<br>241 (24.7%)<br>317 (32.5%)<br>159 (16.3%) | IIIIIIIIIIIII<br>IIIIIIIIIIII<br>IIIIIIIIIIIIIIII<br>IIIIIIII | 975 (97.5%) | 25 (2.5%) |
| 4 | BMI [numeric] | mean (sd) : 25.73 (4.49)<br>min < med < max : 8.83 < 25.62 < 39.44<br>IQR (CV) : 5.72 (0.17) | 974 distinct values | | 974 (97.4%) | 26 (2.6%) |
| 5 | smoker [factor] | 1. Yes<br>2. No | 298 (29.8%)<br>702 (70.2%) | IIIIII<br>IIIIIIIIIIIIIIII | 1000 (100%) | 0 (0%) |
| 6 | cigs.per.day [numeric] | mean (sd) : 6.78 (11.88)<br>min < med < max : 0 < 0 < 40<br>IQR (CV) : 11 (1.75) | 37 distinct values | | 965 (96.5%) | 35 (3.5%) |
| 7 | diseased [factor] | 1. Yes<br>2. No | 224 (22.4%)<br>776 (77.6%) | IIII<br>IIIIIIIIIIIIIIII | 1000 (100%) | 0 (0%) |
| 8 | disease [character] | 1. Hypertension<br>2. Cancer<br>3. Cholesterol<br>4. Heart<br>5. Pulmonary<br>6. Musculoskeletal<br>7. Diabetes<br>8. Hearing<br>9. Digestive<br>10. Hypotension<br>[ 3 others ] | 36 (16.2%)<br>34 (15.3%)<br>21 ( 9.5%)<br>20 ( 9.0%)<br>20 ( 9.0%)<br>19 ( 8.6%)<br>14 ( 6.3%)<br>14 ( 6.3%)<br>12 ( 5.4%)<br>11 ( 5.0%)<br>21 ( 9.5%) | IIIIIIIIIIIIIIII<br>IIIIIIIIIIIIIII<br>IIIIIIIII<br>IIIIIIII<br>IIIIIIII<br>IIIIIIII<br>IIIIII<br>IIIIII<br>IIIII<br>IIII<br>IIIIIIIII | 222 (22.2%) | 778 (77.8%) |
| 9 | samp.wgts [numeric] | mean (sd) : 1 (0.08)<br>min < med < max : 0.86 < 1.04 < 1.06<br>IQR (CV) : 0.19 (0.08) | 0.86!: 267 (26.7%)<br>1.04!: 249 (24.9%)<br>1.05!: 324 (32.4%)<br>1.06!: 160 (16.0%)<br>! rounded | IIIIIIIIIIIII<br>IIIIIIIIIIII<br>IIIIIIIIIIIIIIII<br>IIIIIII | 1000 (100%) | 0 (0%) |
# First save the results
iris_stats_by_species <- by(data = iris,
INDICES = iris$Species,
FUN = descr, stats = c("mean", "sd", "min", "med", "max"),
transpose = TRUE)
# Then use view(), like so:
view(iris_stats_by_species, method = "pander", style = "rmarkdown")
Data Frame: iris
Group: Species = setosa
N: 50
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Sepal.Length | 5.01 | 0.35 | 4.30 | 5.00 | 5.80 |
| Sepal.Width | 3.43 | 0.38 | 2.30 | 3.40 | 4.40 |
| Petal.Length | 1.46 | 0.17 | 1.00 | 1.50 | 1.90 |
| Petal.Width | 0.25 | 0.11 | 0.10 | 0.20 | 0.60 |
Group: Species = versicolor
N: 50
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Sepal.Length | 5.94 | 0.52 | 4.90 | 5.90 | 7.00 |
| Sepal.Width | 2.77 | 0.31 | 2.00 | 2.80 | 3.40 |
| Petal.Length | 4.26 | 0.47 | 3.00 | 4.35 | 5.10 |
| Petal.Width | 1.33 | 0.20 | 1.00 | 1.30 | 1.80 |
Group: Species = virginica
N: 50
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Sepal.Length | 6.59 | 0.64 | 4.90 | 6.50 | 7.90 |
| Sepal.Width | 2.97 | 0.32 | 2.20 | 3.00 | 3.80 |
| Petal.Length | 5.55 | 0.55 | 4.50 | 5.55 | 6.90 |
| Petal.Width | 2.03 | 0.27 | 1.40 | 2.00 | 2.50 |
view(iris_stats_by_species)
Output file written: /var/folders/76/rq_s_23s7fd5r8hqrbg8rmnc0000gp/T//Rtmp1Mxgoj/file3e4b3ee1d2fd.html
Method 'viewer' only valid within RStudio. Switching method to 'browser'.
Output file appended: /var/folders/76/rq_s_23s7fd5r8hqrbg8rmnc0000gp/T//Rtmp1Mxgoj/file3e4b3ee1d2fd.html
data(tobacco) # tobacco is an example dataframe included in the package
BMI_by_age <- with(tobacco,
by(BMI, age.gr, descr,
stats = c("mean", "sd", "min", "med", "max")))
view(BMI_by_age, "pander", style = "rmarkdown")
Variable: tobacco$BMI by age.gr
| | 18-34 | 35-50 | 51-70 | 71 + |
|---|---|---|---|---|
| Mean | 23.84 | 25.11 | 26.91 | 27.45 |
| Std.Dev | 4.23 | 4.34 | 4.26 | 4.37 |
| Min | 8.83 | 10.35 | 9.01 | 16.36 |
| Median | 24.04 | 25.11 | 26.77 | 27.52 |
| Max | 34.84 | 39.44 | 39.21 | 38.37 |
BMI_by_age <- with(tobacco,
by(BMI, age.gr, descr, transpose = TRUE,
stats = c("mean", "sd", "min", "med", "max")))
view(BMI_by_age, "pander", style = "rmarkdown", omit.headings = TRUE)
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| 18-34 | 23.84 | 4.23 | 8.83 | 24.04 | 34.84 |
| 35-50 | 25.11 | 4.34 | 10.35 | 25.11 | 39.44 |
| 51-70 | 26.91 | 4.26 | 9.01 | 26.77 | 39.21 |
| 71 + | 27.45 | 4.37 | 16.36 | 27.52 | 38.37 |
tobacco_subset <- tobacco[ ,c("gender", "age.gr", "smoker")]
freq_tables <- lapply(tobacco_subset, freq)
view(freq_tables, footnote = NA, file = 'freq-tables.html')
Output file written: freq-tables.html
Method 'viewer' only valid within RStudio. Switching method to 'browser'.
Output file appended: freq-tables.html
what.is(iris)
## $properties
##       property      value
## 1        class data.frame
## 2       typeof       list
## 3         mode       list
## 4 storage.mode       list
## 5          dim    150 x 5
## 6       length          5
## 7    is.object       TRUE
## 8  object.type         S3
## 9  object.size 7256 Bytes
##
## $attributes.lengths
##     names     class row.names
##         5         1       150
##
## $extensive.is
## [1] "is.data.frame" "is.list"       "is.object"     "is.recursive"
## [5] "is.unsorted"
freq(tobacco$gender, style = 'rmarkdown')
## ### Frequencies
## **Variable:** tobacco$gender
## **Type:** Factor (unordered)
##
## | | Freq | % Valid | % Valid Cum. | % Total | % Total Cum. |
## |-----------:|-----:|--------:|-------------:|--------:|-------------:|
## | **F** | 489 | 50.00 | 50.00 | 48.90 | 48.90 |
## | **M** | 489 | 50.00 | 100.00 | 48.90 | 97.80 |
## | **\<NA\>** | 22 | | | 2.20 | 100.00 |
## | **Total** | 1000 | 100.00 | 100.00 | 100.00 | 100.00 |
print(freq(tobacco$gender), method = 'render')
| gender | Freq | % Valid | % Valid Cumul | % Total | % Total Cumul |
|---|---|---|---|---|---|
| F | 489 | 50.00 | 50.00 | 48.90 | 48.90 |
| M | 489 | 50.00 | 100.00 | 48.90 | 97.80 |
| <NA> | 22 | | | 2.20 | 100.00 |
| Total | 1000 | 100.00 | 100.00 | 100.00 | 100.00 |
library(skimr)
# skim() expects a data frame; a bare `df` would refer to the stats::df function
skim(iris)